Acorn RISC PD-CD 1

home *** CD-ROM | disk | FTP | other *** search

/ Acorn RISC PD-CD 1 / Acorn RISC PD-CD 1.iso / languages / hope / hopetutorl < prev

Wrap

Text File | 1990-04-25 | 55KB | 2,113 lines

A HOPE TUTORIAL Roger Bailey, Imperial College Functions In Conventional Languages: In a language like Pascal, a function is a piece of 'packaged' program for performing standard operations like finding square roots. To obtain the square root of a positive number stored in a variable x, we write: sqrt ( x ) at the point in the program where we want the value, such as: writeln ( 1.0 + sqrt ( x ) ) ; this is called an application of the function. The value represented by "x" is called the argument or actual parameter. In this context, the canned program computes the square root of "x", "1.0" is added to it and the result is then printed. We can also define our own functions specifying how the result is computed using ordinary Pascal statements. Here's a function that returns the greater of its two argument values: function max ( x, y : INTEGER ) : INTEGER ; begin if x > y then max := x else max := y end ; The identifiers "x" and "y" are called formal parameters. They're used inside the definition to name the two values that will be supplied as arguments when the function is applied. We can use "max" anywhere we need a value, just like "sqrt". Here's how we might use "max" to filter out negative values on output: writeln ( max ( z, 0 ) ) ; Hope Tutorial Page 1 A more interesting case is when the actual parameter is a function application itself or involves one. We can use "max" to find the largest of three numbers by writing: max ( a, max ( b, c ) ) Combining functions together like this is called composition. The expression is evaluated 'inside-out' because the outer application of "max" can't be evaluated until the value of its second argument is known. The inner application of "max" is therefore evaluated first using the values of "b" and "c" and the result is used as the actual parameter of the outer application. Another way of combining functions together is to define more powerful ones using simpler ones as 'building blocks'. If we often need to find the largest of three numbers we might define: function MaxOf3 ( x, y, z : INTEGER ) : INTEGER ; begin MaxOf3 := max ( x, max ( y, z ) ) end ; and apply it by writing: MaxOf3 ( a, b, c ) Programming with functions Pascal is called an imperative language because programs written in it are recipes for 'doing something'. If our programs consist only of functions, we can concentrate on what the results are and ignore how they're computed. Forget that "sqrt" is a piece of code and think of "sqrt ( x )" as a way of writing a value in your program, and you'll get the idea. You can think of "MaxOf3" like this as well if you ignore the way it works inside. By defining a 'toolkit' of useful functions and combining them together like this, we can build powerful programs that are quite short and easy to understand. Hope Tutorial Page 2 In Pascal, functions can only return 'simple' data objects such as numbers or characters, but real programs use big data structures and can't easily be written using these functions. In Hope, functions can return any type of value, including data structures equivalent to Pascal's arrays and records and much more. Programming in Hope has the flavour of simply 'writing down the answer' by writing an expression that defines it. This will contain one or more function applications to define smaller parts of the answer. These functions won't usually be built in like "sqrt", so we'll have to define them ourselves, but we'll still think of them as definitions of data objects, and not as algorithms for computing them. A simple Hope example -- conditionals Let's see how we can define "max" in Hope. Like Pascal, it's a strongly-typed language, which means we must tell the compiler about the types of all objects in our programs so it can check that they're used consistently. The function definition comes in two parts. First we declare the argument and result types: dec max : num # num -> num ; "dec" is in bold face because it's a reserved word and you can't use it as a name. Type it in lower case when entering programs. "max" is the name of the function being defined. Names consist of upper and lower case letters (which are distinct) and digits, and must start with a letter. The current fashion is to use lower case. The layout isn't significant and you can separate symbols with any number of blanks, tabs and newlines for clarity, as in this example. Symbols need only be separated when they might otherwise be confused as one, such as "dec" and "max". The next part of the declaration gives the types of the arguments (read the symbol ":" as 'takes a'). Non-negative integers are of the predefined type "num" (in lower case). "#" is read as 'and a'; (or you can use the reserved word "X"). "->" is read as 'yields'. The semicolon marks the end of the declaration, which tells the compiler that "max" takes two numbers as arguments and returns a single number as its result. The result of a function is defined by one or more recursion equations. "max" needs only one equation to define it: --- max ( x, y ) <= if x > y then x else y ; Hope Tutorial Page 3 Read the symbol "---" as 'the value of'. The expression "max ( x, y )" is called the left-hand side of the equation. It defines "x" and "y" as formal parameters, or local names for the values that will be supplied when the function is applied. Parameter names are local to the equation, so "x" and "y" won't be confused with any other "x" or "y" in the program. The symbol "<=" is read as 'is defined as'. The rest of the equation (called the right-hand side) defines the result. It's a conditional expression. The symbols "if", "then" and "else" are reserved words. Pascal's conditional statement chooses between alternative actions, but Hope's conditional expression chooses between alternative values, in line with our view that function applications are ways of writing values rather than recipes for computing them. If the value of the expression "x > y" is "true", the value of the whole conditional expression is the value of "x", otherwise it's the value of "y". The alternative values can be defined by any Hope expressions. When the value of a function is defined by more than one expression like this, they are evaluated in an unspecified order. On a suitable computer, such as the Imperial College ALICE machine, it's even possible to evaluate both expressions and the test in parallel and throw away one of the values according to the result of the test. Using functions that we've defined A Hope program is just a a single expression containing one or more function applications composed together. It's evaluated immediately and the result and its type are printed on the screen. Here's a simple program that uses "max", with its output (which is shown in italics): max ( 10, 20 ) + max ( 1, max ( 2,3 ) ) ; 23 : num The rules for evaluating the expression are the same as those of Pascal: function arguments are evaluated first, the functions are applied, and finally other operations are performed in the usual order of priority. Hope Tutorial Page 4 We can also use existing functions to define new ones. Here's the Hope version of "MaxOf3": dec MaxOf3 : num # num # num -> num ; --- MaxOf3 ( x, y, z ) <= max ( x, max ( y, z ) ) ; A more interesting example - repetition Just as Pascal's conditional statement is replaced by Hope's conditional value, so the repetitive statement is replaced by the repetitive value. Here's a Pascal function that multiplies two numbers using repeated addition: function mult ( x, y : INTEGER ) : INTEGER ; var prod : INTEGER ; begin prod := 0 ; while y > 0 do begin prod := prod + x ; y := y - 1 end ; mult := prod end ; It's hard to be sure this function does enough additions (it took me three tries to get it right) and this seems to be a general problem with loops in programs. A common way of checking imperative programs is to simulate their execution. If we do this for input values of 2 and 3, we'll find that "prod" starts with the value 0 and gets values of 2, 4 and 6 on successive loop iterations, which suggests the definition is correct. Hope Tutorial Page 5 Hope doesn't have any loops, so we must write all the additions that the Pascal program performed in a single expression. It's much easier to see that this has the right number of additions: dec mult : num # num -> num ; --- mult ( x, y ) <= 0 + x + x + ... or would be if we knew how many times to write "+ x". The hand simulation suggests we need to write it "y" times, which is tricky when we don't know the value of "y". What we do know is that for a given value of "y", the expressions: mult ( x, y ) and mult ( x, y-1 ) + x will have the same number of "+ x" terms if written out in full. The second one always has two terms, whatever the value of "y", so we'll use it as the definition of "mult": --- mult ( x, y ) <= mult ( x, y-1 ) + x ; On the face of it we've written something ridiculous, because it means we must apply "mult" to find the value of "mult". Remember however that this is really shorthand for "0" followed by "y" occurrences of "+ x". When "y" is zero, the result of "mult" is also zero because there are no "+ x" terms. In this case "mult" isn't defined in terms of itself, so if we add a special test for it, the definition terminates. A usable definition of "mult" is: --- mult ( x, y ) <= if y = 0 then 0 else mult ( x, y-1 ) + x ; Functions that are defined using themselves like this are called recursive. Every Pascal program using a loop can be expressed as a recursive function in Hope. All recursive definitions need one case (called the base case) where the function isn't defined in terms of itself, just as Pascal loops need a terminating condition. Hope Tutorial Page 6 Another way of using functions Hope allows us to use a function with two arguments like "mult" as an infix operator. We must assign it a priority and use it as an operator everywhere including the equations that definine it. The definition of "mult" when used as an infix operator looks like this: infix mult : 8 ; dec mult : num # num -> num ; --- x mult y <= if y = 0 then 0 else x mult ( y - 1 ) + x ; A bigger number in the infix declaration means a higher priority. The second argument of "mult" is parenthesised because its priority of 8 is greater than the built-in subtraction operation. Most of Hope's standard functions are supplied as infix operators. Other kinds of data Hope provides two other primitive data types. A "truval" (truth value) is equivalent to a Pascal Boolean and has values "true" and "false". We've already seen the expression "x > y" defining a truth value. ">" is a standard function whose type is "num # num -> truval". We can use truth values in conditional expressions and combine them together with the standard functions "and", "or" and "not". Single characters are of type "char", with values "'a'", "'b'" and so on. Characters are most useful as components of data structures such as character-strings. Hope Tutorial Page 7 Data structures Practical programs need data structures and Hope has two standard kinds already built in. The simplest kind corresponds to a Pascal record. We can bind a fixed number of objects of any type together into a structure called a tuple, for example: ( 2, 3 ) or ( 'a', true ) are tuples of type "num # num" and "char # truval" respectively. We use tuples when we want a function to define more than one value. Here's one that defines the time of day given the number of seconds since midnight: dec time24 : num -> num # num # num ; --- time24 ( s ) <= ( s div 3600, s mod 3600 div 60, s mod 3600 mod 60 ) ; "div" is the built-in integer division function and "mod" gives the remainder after integer division. If we type an application of "time24" at the terminal, the result tuple and its type will be printed on the screen in the usual way: time24 ( 45756 ) ; ( 12,42,36 ) : ( num # num # num ) Hope Tutorial Page 8 The second standard data type, called a list, corresponds roughly to a one-dimensional array in Pascal. It can contain any number of objects (including none at all) but they must all be the same type. We can write expressions in our programs that represent lists, such as: [ 1, 2, 3 ] which is of type "list ( num )". There are two standard functions for defining lists. The infix operator "::" (read as 'cons') defines a list in terms of a single object and list containing the same type of object, so 10 :: [ 20, 30, 40 ] defines the list: [ 10, 20, 30, 40 ] Don't think of "::" as adding "10" to the front of "[ 20, 30, 40 ]". It really definines a new list "[ 10, 20, 30, 40 ]" in terms of two other objects without changing their meaning, rather in the same way that "1 + 3" defines a new value of "4" without changing the meaning of "1" or "3". The other standard list function is "nil", which defines a list with no elements in it. We can represent every list by an expression consisting of applications of "::" and "nil". When we write an expression like: [ a + 1, b - 2, c * d ] it's considered to be a shorthand way of writing: a + 1 :: ( b - 2 :: ( c * d :: nil ) ) There's also a shorthand way of writing lists of characters. The following three expressions are all equivalent: "cat" [ 'c', 'a', 't' ] 'c' :: ( 'a' :: ( 't' :: nil ) ) Hope Tutorial Page 9 When the result of a Hope program is a list, it's always printed out in the concise bracketed notation; if it's a list of characters, it's printed in quotes. Every data type in Hope is defined by a set of primitive functions like "::" and "nil". They're called constructor functions, and aren't defined by recursion equations. When we defined a tuple, we were actually using a standard constructor called "," (read as 'comma'). Later on we'll see how constructors are defined for other types of data. Functions that define lists If we wanted to write a Pascal program to print the first n natural numbers in descending order we'd probably write a loop that printed one value out on each iteration, such as: for i := n downto 1 do write ( i ) ; In Hope we write one expression that defines all the values at once, rather like we did for "mult": dec nats : num -> list ( num ) ; --- nats ( n ) <= if n = 0 then nil else n :: nats ( n-1 ) ; "nil" is useful for writing the base case of a recursive function that defines a list. If we try the function at the terminal by typing: nats ( 10 ) ; [ 10,9,8,7,6,5,4,3,2,1 ] : list ( num ) we can see that the numbers are in descending order because that's the way we arranged them in the list, and not because they were defined in that order. The values in the expression defining the list are treated as though they were all generated at the same time. On the ALICE machine they actually are generated at the same time. Hope Tutorial Page 10 To get the results of a Hope program in the right order, we must put them in the right place in the final data structure. If we want the list of the numbers "n" through "1" in the opposite order we can't write the definition as: ... else nats ( n-1 ) :: n ; because the argument types for "::" are the wrong way round. We need to use a another built-in operation "<>" (read as 'append') that concatenates two lists. The definition will then look like this: --- nats ( n ) <= if n = 0 then nil else nats ( n-1 ) <> [ n ] ; We put "n" in brackets to make it into a (single-item) list because "<>" expects both its arguments to be lists. We could also have written "( n :: nil )" instead of "[ n ]". Data structures as parameters Suppose we have a list of integers and we want to write a function to add up all its elements. The declaration will look like this: dec sumlist : list ( num ) -> num ; We need to refer to the individual elements of the actual parameter in the equations defining "sumlist". We do this using an equation whose left-hand side looks like this: --- sumlist ( x :: y ) ... Hope Tutorial Page 11 This is an expression involving list constructors and corresponds to an actual parameter that's a list. "x" and "y" are formal parameters, but they now name individual parts of the actual parameter value. In an application of "sumlist" like: sumlist ( [ 1, 2, 3 ] ) the actual parameter will be 'dismantled' so that "x" names the value "1" and "y" names the value "[ 2, 3 ]". The complete equation will be: --- sumlist ( x :: y ) <= x + sumlist ( y ) ; Notice there's no base case test. As we might expect, it's the empty list, but we can't test for it directly in the equation because there's no formal parameter that refers to the whole list. In fact, if we write the application: sumlist ( nil ) we'll get an error message because we can't dismantle "nil" to find the values of "x" and "y". We must cover this case separately using a second recursion equation: --- sumlist ( nil ) <= 0 ; The two equations can be given in either order. When "sumlist" is applied, the actual parameter is examined to see which constructor function was used to define it. If the actual parameter is a non-empty list, the first equation is used, because non-empty lists are defined using the "::" constructor. The first number in the list gets named "x" and the remaining list "y". If the actual parameter is the empty list, the second equation is be used because empty lists are defined using the constructor "nil". Hope Tutorial Page 12 Pattern-matching An expression composed of constructors appearing on the left-hand side of a recursion equation is called a pattern. Selecting the right recursion equation and dismantling the actual parameter to name its parts is called pattern-matching. When you write a function, you must give a recursion equation for each possible constructor defining the argument type. Sometimes we don't need to dismantle the actual parameter, and we can use a formal parameter in the pattern that matches the whole object, irrespective of what constructors were used to define it. As an example, let's see how we could define our own function to concatenate two lists like the built-in operation "<>": infix cat : 4 ; dec cat : list( num ) # list( num ) -> list ( num ) ; --- ( h :: t ) cat l <= h :: ( t cat l ) ; --- nil cat l <= l ; The first list parameter is matched by the pattern "( h :: t )" so that its first item (the 'head") and the remaining list (the 'tail') can be referred to separately on the right-hand side. The second recursion equation covers the case when the first list is empty. The second list parameter is matched by the pattern "l" whether it's empty or not. As well as writing enough recursion equations to satisfy all the parameter constructors, we must also be careful not to write sets of equations where more than one pattern might match the actual parameters, because that would be ambiguous. We can write patterns to match arguments that are tuples in the same way using the tuple constructor ",". When we wrote "mult ( x, y )" you probably thought the parentheses and the comma were something to do with the function application. In fact, we were constructing a tuple and the parentheses were only needed because "," has a low priority. Hope treats all functions as having only one argument. This can be a tuple when you want the effect of several arguments. Without parentheses, mult x, y would be interpreted as: ( mult ( x ), y ) Hope Tutorial Page 13 A recursion equation with the left-hand side: --- mult ( x, y ) <= ... is just a pattern-match on a tuple. The first item in the tuple gets named "x" and the second one "y". We can also use pattern-matching on "num" parameters. These are defined by two constructors called "succ" and "0". "succ" defines a number in terms of the next lower one. "0" has no arguments and defines the value zero. Surely "0" is a value, not a function? Well, we're already used to thinking of function applications as another way of writing values, so it's quite consistent to think of "0" as a function application. Here's a version of "mult" that uses pattern-matching to identify the base case: infix mult : 8 ; dec mult : num # num -> num ; --- x mult 0 <= 0 ; --- x mult succ ( y ) <= ( x mult y ) + x ; We can read "succ ( y )" as 'the successor of some number that we'll call "y"'. Instead of naming the actual parameter "y" like we did in the original version of "mult", we're naming its predecessor. Hope Tutorial Page 14 Simplifying expressions In Pascal programs we can simplify complex expressions by removing common sub-expressions and evaluating them separately. Instead of: writeln ( ( x + y ) * ( x + y ) ) ; we would probably write: z := x + y ; writeln ( z * z ) ; which is clearer and more efficient. Hope programs consist only of expressions and it's even more important to simplify them. We do this by using a qualified expression: let z == x + y in z * z ; This looks like an assignment, but it isn't. "==" is read as 'is defined as' and "z" is local to the expression following the "in". If we write something like: let z == z + 1 in z * z ; we're actually introducing a new variable "z" to use in the sub-expression "z * z". It hides the original one in the sub-expression "z + 1". There's a second form of qualified expression for people who like to use variables first and define their meanings later. It looks like this: z * z where z == x + y ; The result of the qualified expression is the same whether we define it using let or where. "x + y" is evaluated first, and its value is used in the main expression. Hope Tutorial Page 15 The qualifying expression will often be a function application that defines a data structure. If we want to name part of the structure we can use a pattern on the left-hand side of the "==" symbol: dec time12 : num -> num # num ; --- time12 ( s ) <= ( if h > 12 then h-12 else h, m ) where ( h, m, s ) == time24 ( s ) ; We'll use this construction most often when we write recursive functions that define tuples. Here's a typical example. Suppose we want to form a string of words from a sentence. For simplicity a word is taken to be any sequence of characters, and words are separated in the sentence by any number of blanks. The sentence and a single word will be of type "list ( char )" and the final sequence of words a "list ( list ( char ) )". It's fairly straightforward to obtain the first word. Here's a function that does it: dec firsttry : list ( char ) -> list ( char ) ; --- firsttry ( nil ) <= nil ; --- firsttry ( c :: s ) <= if c = ' ' then nil else c :: firsttry ( s ) ; One of the nice features of Hope is that we can type in and print out any kind of value, so it's easy to check out the individual functions of our program separately. If we test "firsttry" we'll see: firsttry ( "You may hunt it with forks and Hope" ) ; "You" : list ( char ) But there's a problem here because we're going to need the rest of the sentence if we're to find the remaining words. We must arrange that that the function returns the remaining list as well as the first word. This is where tuples come in: dec firstword : list ( char ) -> list ( char ) # list ( char ) ; --- firstword ( nil ) <= ( nil, nil ) ; --- firstword ( c :: s ) <= if c = ' ' then ( nil, s ) else ( ( c :: w, r ) where ( w, r ) == firstword ( s ) ) ; Hope Tutorial Page 16 The qualified expression is parenthesised so it only applies to the expression after the else, otherwise we'll evaluate "firstword" recursivly as long as the sentence is non-empty, even if it starts with a blank. This version of the function produces: firstword ( "Hope springs eternal ..." ) ; ( "Hope","springs eternal ..." ) : ( list ( char ) # list ( char ) ) We can use this to define a function to split the sentence into a list of its individual words: dec wordlist : list ( char ) -> list ( list ( char ) ) ; --- wordlist ( nil ) <= nil ; --- wordlist ( c :: s ) <= if c = ' ' then wordlist( s ) else ( w :: wordlist ( r ) where ( w, r ) == firstword ( c :: s ) ) ; which we can test by typing an application at the terminal: wordlist ( " While there's life there's Hope " ) ; [ "While","there's","life","there's","Hope" ] : list ( list ( char ) ) Review So far we've concentrated on features of Hope that have something in common with traditional languages such as Pascal, but without many of their limitations, such as fixed-size data structures. We've also been introduced to the functional style of programming where programs are no longer recipes for action, but just definitions of data objects. Now we'll introduce features of Hope that lift it onto a much higher level of expressive power, and let us write programs that are not only extremely powerful and concise, but that can be checked for correctness at compile-time and mechanically transformed into more efficient versions. Hope Tutorial Page 17 Making functions more powerful The Hope compiler can spot many common kinds of error by checking the types of all objects in expressions. This is harder than checking at run-time, but more efficient and saves the embarrassment of discovering an error at run-time in a rarely-executed branch of the air traffic control system we just wrote. However, strict type-checking can be a nuisance if we want to perform some operation that doesn't depend on the type of the data. Try writing a Pascal procedure to reverse an array of either 10 integers or 10 characters and you'll see what I mean. Hope avoids this kind of restriction by allowing a function to operate on more than one type of object. We've already used the standard constructors "::" and "nil" to define a "list ( num )", a "list ( char )" and a "list ( list ( char ) )". The standard equality function "=" will compare any two objects of the same type. Functions with this property are called polymorphic. Pascal's built-in functions "abs" and "sqr" and operations like ">" and "=" are polymorphic in a primitive kind of way. We can define our own polymorphic functions in Hope. The function "cat" we defined above will concatenate lists of numbers, but we can use it for lists containing any type of object. To do this we first declare a kind of 'universal type' called a type variable. We use this in the declaration of "cat" where it stands for any actual type: typevar alpha ; infix cat : 8 ; dec cat : list ( alpha ) # list ( alpha ) -> list ( alpha ) ; This says that "cat" has two parameters that are lists and defines a list, but it doesn't say what kind of object is in the list. However, "alpha" always stands for the same type throughout a given declaration, so all the lists must contain the same type of object. The expressions: [ 1,2,3 ] cat [ 4,5,6 ] and "123" cat "456" are correctly-typed applications of "cat" and define a "list ( num )" and a "list ( char )" respectively, while the expression: [ 1,2,3 ] cat "456" Hope Tutorial Page 18 isn't because "alpha" can't be interpreted as two different types. The interpretation of a type variable is local to a declaration so it can have different interpretations in other declarations without confusion. Of course it only makes sense for a function to be polymorphic as long as the equations defining it don't make any assumptions about types. In the case of "cat" the definition uses only "::" and "nil", which are polymorphic themselves. However, a function like "sumlist" uses "+" and can only be used with lists of numbers as parameters. Defining your own data types Tuples and lists are quite powerful, but for more sophisticated applications, we'll need to define our own types. User-defined types make programs clearer and help the type-checker to help the programmer. We introduce a new data type in a data declaration: data vague == yes ++ no ++ maybe ; "data" is a reserved word and "vague" is the name of the new type. "==" is read as 'is defined as' and "++" is read as 'or'. "yes", "no" and "maybe" are the names for the constructor functions of the new type. We can now write function definitions that use these constructors in patterns: dec evade : vague -> vague ; --- evade ( yes ) <= maybe ; --- evade ( maybe ) <= no ; The constructors can be parameterised with any type of object, including the type that's being defined. We can define types like lists, whose objects are of unlimited size using this kind of recursive definition. As an example, here's a user-defined binary tree that can contain numbers as its leaves: data tree == empty ++ tip ( num ) ++ node ( tree # tree ) ; Hope Tutorial Page 19 There are three constructors. "empty" has no parameters and defines a tree with nothing in it. "tip" defines a tree in terms of a single num, and "node" defines a tree in terms of two other trees. Here's a typical tree: . / \ / \ ___/ \___ ________/ \________ / / \ \ / /\ /\ \ 1 / \ / \ \ / \ / \ 5 2 3 . 4 Here's an example of a function that manipulates trees. It returns the sum of all the numbers contained in one: dec sumtree : tree -> num ; --- sumtree ( empty ) <= 0 ; --- sumtree ( tip ( n ) ) <= n ; --- sumtree ( node ( l, r ) ) <= sumtree ( l ) + sumtree ( r ) ; Unfortunately there's no shorthand for writing tree constants like there is for list constants, so we've got to write them out the long way using constructors. If we want to use "sumtree" to add up all the numbers in the example tree above, we must type in the expression: sumtree ( node ( node ( tip ( 1 ), node ( tip ( 2 ), tip ( 3 ) ) ), node ( node ( empty, tip ( 4 ) ), tip ( 5 ) ) ) ) ; This isn't really a drawback, because programs that manipulate complex data structures like trees will generally define them using other functions. However, it's very useful to be able to type any kind of constant data structure at the terminal when we're checking out an individual function like "sumtree". When we want to test a Pascal program piecemeal, we usually have to write elaborate test harnesses or stubs to generate test data. Hope Tutorial Page 20 Making data more abstract The identifier "list" isn't really a Hope data type. It's called a type constructor and must be parameterised with an actual type before it represents one. We did this every time we declared a "list ( num )" or a "list ( char )". The parameter can also be a user-defined type, as with a "list ( tree )" or even a type variable as in "list ( alpha )", which defines a polymorphic data type. Constructing new data types like this is a compile-time operation by the way, and you shouldn't confuse it with constructing new data values, which is a run-time operation. You can define your own polymorphic data types. Here's a version of the binary tree we defined earlier that can have any type of value in its leaves: data tree ( alpha ) == empty ++ tip ( alpha ) ++ node ( tree ( alpha ) # tree ( alpha ) ) ; Once again, "alpha" is taken to be the same type throughout one instance of a tree. If it's a number, then all references to "tree ( alpha )" are taken as references to "tree ( num )". We can define polymorphic functions that operate on trees containing any type of object, because tree constructors are now polymorphic. Here's a function to 'flatten' a binary tree into a list of the same type of object: dec flatten : tree ( alpha ) -> list ( alpha ) ; --- flatten ( empty ) <= nil ; --- flatten ( tip ( x ) ) <= x :: nil ; --- flatten ( node ( x, y ) ) <= flatten ( x ) <> flatten ( y ) ; Hope Tutorial Page 21 We can demonstrate it on various kinds of tree: flatten( node ( tip ( 1 ), node ( tip ( 2 ), tip ( 3 ) ) ) ) ; [ 1, 2, 3 ] : list ( num ) flatten( node ( tip ( "one" ), node ( tip ( "two" ), tip ( "three" ) ) ) ) ; [ "one","two","three" ] : list ( list ( char ) ) flatten( node ( tip ( tip ( 'a' ) ), node ( tip ( empty ), tip ( node ( tip ( 'c' ), empty ) ) ) ) ) ; [ tip ( 'a' ),empty,node ( tip ( 'c' ), empty) ] : list ( tree ( char ) ) Notice how the type-checker may need to go through several levels of data types to deduce the type of the result. Hope Tutorial Page 22 Even more concise programs The importance of polymorphic types and functions is that they let us write shorter, clearer programs. It's similar to the way Pascal procedures let us use the same code to operate on different data values, but much more powerful. We can write a single Hope function to reverse a list of numbers or characters, where we'd need to write two identical Pascal subroutines to reverse an array of integers and an array of characters. We can use polymorphic functions whenever we're concerned only with the arrangement of the objects in a data structure and not with their values. Sometimes, we'll want to apply some function to the primitive data items in the structure. Here's a function that uses a second function "square" to define a "list (num)" whose elements are the squares of another "list (num)": dec square : num -> num ; --- square ( n ) <= n * n ; dec squarelist : list ( num ) -> list ( num ) ; --- squarelist ( nil ) <= nil ; --- squarelist ( n :: l ) <= square ( n ) :: squarelist ( l ) ; Every time we write a function to process every element of a list, we'll write something almost identical to "squarelist". Here's a function to define a list of factorials: dec fact : num -> num ; --- fact ( 0 ) <= 1 ; --- fact ( succ ( n ) ) <= succ ( n ) * fact ( n ) ; dec factlist : list ( num ) -> list ( num ) ; --- factlist ( nil ) <= nil ; --- factlist ( n :: l ) <= fact ( n ) :: factlist ( l ) ; "factlist" has exactly the same 'shape' as "squarelist", it just applies "fact" instead of "square" and then applies itself recursively. Values that differ between applications are usually supplied as actual parameters. Hope treats functions as data objects, so we can do this in a perfectly natural way. A function that can take another function as an actual parameter is called a higher-order function. When we declare it we must give the type of formal parameter standing for the function in the usual way. The declaration of "fact" tells us that it's: num -> num Hope Tutorial Page 23 Read this as 'a function mapping numbers to numbers'. Now let's see how we can use this idea to write "factlist" and "squarelist" as a single higher- order function. The new function needs two parameters, the original list, and the function that's applied inside it. Its declaration will be: dec alllist : list ( num ) # ( num -> num ) -> list ( num ) ; The 'shape' of "alllist" will be the same as "factlist" and "squarelist", but the function we apply to each element of the list will be the formal parameter: --- alllist ( nil, f ) <= nil ; --- alllist ( n :: l, f ) <= f ( n ) :: alllist ( l, f ) ; We use "alllist" like this: alllist ( [ 2,4,6 ], square ) ; [ 4,16,36 ] : list ( num ) alllist ( [ 2,4,6 ], fact ) ; [ 2,24,720 ] : list ( num ) Notice that there's no argument list after "square" or "fact" in the application of "alllist", so this construction won't be confused with functional composition. "fact( 3 )" represents a function application, but "fact" by itself represents the unevaluated function. Higher-order functions can also be polymorphic. We can use this idea to write a more powerful version of "alllist" that will apply an arbitrary function to every element of a list of objects of arbitrary type. This version of the function is usually known as "map": typevar alpha, beta ; dec map : list ( alpha ) # ( alpha -> beta ) -> list ( beta ) ; --- map ( nil, f ) <= nil ; --- map ( n :: l, f ) <= f ( n ) :: map ( l, f ) ; The definition now uses two type variables "alpha" and "beta". Each one represents the same actual type throughout one instance of "map", but the two types can be different. This means we can use any function that maps alphas to betas to generate a list of betas from any list of alphas. Hope Tutorial Page 24 The actual types aren't restricted to scalars, which makes "map" rather more powerful than we might realise at first sight. Suppose we've got a suitably polymorphic function that finds the length of a list: typevar gamma ; dec length : list ( gamma ) -> num ; --- length ( nil ) <= 0 ; --- length ( n :: l ) <= 1 + length ( l ) ; length ( [ 2,4,6,8 ] ) + length ( "cat" ) ; 7 : num We can use "map" to apply "length" to every element of a list of words defined by "wordlist": map ( wordlist ( "The form remains, the function never dies" ), length ) [ 3,4,8,3,8,5,4 ] : list ( num ) ; In this example "alpha" is taken to be a "list ( char )" and "beta" to be a num, so the type of the function must be "( list ( char ) -> num )". "length" fits the bill if "gamma" is taken to be a character. Hope Tutorial Page 25 Common patterns of recursion "map" is powerful because it sums up a pattern of recursion that turns up frequently in Hope programs. We can see another common pattern in the function "length" used above. Here's another example of the same pattern: dec sum : list ( num ) -> num ; --- sum ( nil ) <= 0 ; --- sum ( n :: l ) <= n + sum ( l ) ; The underlying pattern consists of processing each element in the list and accumulating a single value that forms the result. In "sum", each element contributes its value to the final result. In "length" the contribution is always "1" irrespective of the type or value of the element, but the pattern is identical. Functions that display this pattern are of type: ( list ( alpha ) -> beta ) In the function definition, the equation for a non-empty list parameter will specify an operation whose result is a "beta". This is "+" in the case of "length" and "sum". One argument of the operation will be a list element and the other will be defined by a recursive call, so the type of the operation needs to be: ( alpha # beta -> beta ) This operation differs between applications, so it will be supplied as a parameter. Finally we need a parameter of type "beta" to specify the base case result. The final version of the function is usually known as "reduce". In the following definition, the symbol "!" introduces a comment, which finishes with another "!" or with a newline: dec reduce : list ( alpha ) # ! the input list ! ( alpha # beta -> beta ) # ! the reduction operation ! beta ! the base case result ! -> beta ; ! the result type ! --- reduce ( nil, f, b ) <= b ; --- reduce ( n :: l, f, b ) <= f ( n, reduce ( l, f, b ) ) ; Hope Tutorial Page 26 To use "reduce" as a replacement for "sum" we'll need to supply the standard function "+" as an actual parameter. We can do this if we prefix it with the symbol nonop to tell the compiler we don't want to use it as an infix operator: reduce ( [ 1,2,3 ], nonop +, 0 ) ; 6 : num When we use "reduce" as a replacement for "length", we're not interested in the first argument of the reduction operation because we always add "1" whatever the list element is. This function ignores its first argument: dec addone : alpha # num -> num ; --- addone ( _ , n ) <= n + 1 ; We use "_" to represent any argument we don't want to refer to. reduce ( "a map they could all understand", addone, 0 ) ; 31 : num Like "map", "reduce" is much more powerful than it first appears because the reduction function needn't define a scalar. Here's a candidate that inserts an object into an ordered list of the same kind of object: dec insert : alpha # list ( alpha ) -> list ( alpha ) ; --- insert ( i, nil ) <= i :: nil ; --- insert ( i, h :: t ) <= if i < h then i :: ( h :: t ) else h :: insert ( i, t ) ; Hope Tutorial Page 27 Actually this isn't strictly polymorphic as its declaration suggests, because it uses the built-in function "<", which is only defined over numbers and characters, but it shows the kind of thing we can do. When we use it to reduce a list of characters: reduce ( "All sorts and conditions of men", insert, nil ) ; " Aacddefiillmnnnnoooorssstt" : list ( char ) we can see that it actually sorts them. The sorting method (insertion sort) isn't very efficient, but the example shows something of the power of higher-order functions and of "reduce" in particular. It's even possible to use "reduce" to get the effect of "map", but that's left as an exercise for the reader as they say. Of course "map" and "reduce" only work on "list ( alpha )" and we'll need to provide different versions for our own structured data types. This is the preferred style of Hope programming, because it makes programs largely independent of the 'shape' of the data structures they use. Here's an alternative kind of binary tree that holds data at its nodes rather than its tips, and a reduce function for it: data tree ( alpha ) == empty ++ node ( tree ( alpha ) # alpha # tree ( alpha ) ) ; dec redtree : tree ( alpha ) # ( alpha # beta -> beta ) # beta -> beta ; --- redtree ( empty, f, b ) <= b ; --- redtree ( node ( l, v, r ), f, b ) <= redtree ( l, f, f ( v, redtree ( r, f, b ) ) ) ; We can use this kind of tree to define a more efficient sort. An ordered binary tree has the property that all the objects in its left subtree logically precede the node object and all those in its right subtree are equal to the node object or logically succeed it. We can build an ordered tree if we have a function to insert new objects into an already-ordered tree, such as: dec instree : alpha # tree ( alpha ) -> tree ( alpha ) ; --- instree ( i, empty ) <= node ( empty, i, empty ) ; --- instree ( i, node ( l, v, r ) ) <= if i < v then node ( instree ( i, l ), v, r ) else node ( l, v, instree ( i, r ) ) ; Hope Tutorial Page 28 We can sort a list by converting its elements into an ordered tree using "instree", then flattening the tree back into a list. This is very easy to specify using the two kinds of reduction we've defined: dec sort : list ( alpha ) -> list ( alpha ) ; --- sort ( l ) <= redtree( reduce ( l, instree, empty ), nonop ::, nil ) sort ( "Mad dogs and Englishmen" ) ; " EMaadddegghilmnnnoss" : list ( char ) Anonymous functions When we used "map" and "reduce", we had to define extra functions like "fact" and "square" to pass in as paramteters. This is a nuisance if we don't need them anywhere else in the program and especially if they're trivial, like "sum" or "addone". For on-the-spot use in cases like this, we can use an anonymous function called a lambda-expression. Here's a lambda-expression corresponding to "sum": lambda ( x, y ) => x + y The symbol "lambda" introduces the function and "x" and "y" are its formal parameters. The expression "x + y" is the function definition. The part after "lambda" is just a recursion equation without the "---" and with "=>" instead of "<=". Here's another lambda-expression used as the actual parameter of "reduce": reduce ( [ "toe","tac","tic" ], lambda ( a, b ) => b <> a, nil ) ; "tictactoe" : list ( char ) There can be more than one recursion equation in the lambda- expression. They're separated from each other by the symbol "|" and pattern-matching is used to select the appropriate one. Here's an example that uses pattern-matching in a lambda-expression to avoid division by zero when the function it defines is executed: map ( [ 1,0,2,0,3 ], lambda ( 0 ) => 0 | ( succ ( n ) ) => 100 div succ ( n ) ) ; [ 100,0,50,0,33 ] : list ( num ) Hope Tutorial Page 29 Functions that create new functions As we've seen, Hope functions possess 'full rights' and can be passed as actual parameters like any data object. It'll be no surprise to learn that we can return a function as the result of another function. The result can be a named function or an anonymous function defined by a lambda-expression. Here's a simple example: dec makestep : num -> ( num -> num ) ; --- makestep ( i ) <= lambda ( x ) => i + x ; makestep ( 3 ) ; lambda ( x ) => 3 + x : num -> num As we can see from trying "makestep", its result is an anonymous function that adds a fixed quantity to its single argument. The size of the increment was specified as an actual parameter to "makestep" when the new function was created, and has become 'bound in' to its definition. If we try the new function, we'll see that it really does add "3" to its actual parameter: makestep ( 3 ) ( 10 ) ; 13 : num There are two applications here. First we apply "makestep" to "3", then the resulting anonymous function is applied to "10". Finally, here's a function that has functions as both actual parameter and result: dec twice : ( alpha -> alpha ) -> ( alpha -> alpha ) ; --- twice ( f ) <= lambda ( x ) => f ( f ( x ) ) ; "twice" defines a new function that has a single argument and some other function "f" bound into its definition. The new function has the same type as "f". We can see its effect using a simple function like "square": twice ( square ) ; lambda ( x ) => square( square ( x ) ) : num -> num twice ( square ) ( 3 ) ; 81 : num Hope Tutorial Page 30 The new function applies the bound-in function to its argument twice. We can even bind in "twice" itself, generating a new function that behaves like "twice" except that the function eventually bound in will be applied four times: twice ( twice ) ; lambda ( x ) => twice ( twice ( x ) ) : ( alpha -> alpha ) -> ( alpha -> alpha ) twice ( twice ) ( square ) ( 3 ) ; 43046721 : num Hope Tutorial Page 31 In conclusion In this article you've been introduced to the ideas of functional programming through one of the new generation of functional languages. You saw how a Hope program is just a series of functions that are regarded as definitions of parts of a data structure - the 'results' of the program - and how the powerful idea of higher-order functions allows us to capture many common program patterns in a single function. Some of these ideas will already be familiar to users of LISP, but they appear in a purer form in Hope, because there are no mechanisms for updating data structures like LISP's SETQ and RPLACA or for specifying the order of evaluation like GO and PROG. Unlike LISP programs, Hope programs are free from side-effects and possess the mathematical property of referential transparency. You also saw features that are primitive or lacking in LISP and in most imperative languages. The data declaration lets you define complex data types without worrying about how they're represented and pattern- matching lets you decompose them, so you can use abstract data types directly without writing access procedures and without the need to invent lots of new names. The typing mechanism lets the compiler check that you're using data objects in a correct and consistent way, while the idea of polymorphic types stops the checking from being too restrictive and lets you define common data 'shapes' with a single function. Higher-order functions and polymorphic types allow us to write programs that are very concise. Programmers are more productive and their programs are easier to understand and to reason about. The property of referential transparency improves our ability to reason about programs still further and makes it possible to transform them mechanically into programs that are still provably correct, but more efficient in their use of space or time. Finally, referential transparency keeps the meaning of Hope programs independent of the order they're evaluated in, making them ideal for parallel evaluation on suitable machines. You'll be seeing a lot more of Hope and languages like it in the future. Hope Tutorial Page 32